Profiling and Optimizing Micro-Architecture Bottlenecks on the Hardware Level

Authors

  • Francis B. Moreira
  • Marco A. Z. Alves
  • Matthias Diener
  • Philippe O. A. Navaux
  • Israel Koren
Abstract

Most speculative mechanisms in current superscalar processors, such as branch predictors and prefetchers, use information gathered at instruction granularity. However, many of these characteristics can be obtained at the basic block level, which increases the amount of code that can be covered while requiring less space to store the data. Furthermore, analyzing the different instruction types inside a block allows the code to be profiled more accurately and yields a greater variety of information. Because of these advantages, block-level analysis provides more opportunities for mechanisms that use this information. For example, branch prediction and memory access information can be combined to give speculative mechanisms more precise information, increasing accuracy and performance. We propose BLAP, an online mechanism that profiles bottlenecks at the micro-architectural level, such as delinquent memory loads, hard-to-predict branches, and contention for functional units. BLAP works at the basic block level and provides information that can be used to optimize these bottlenecks. We created a prefetch dropping mechanism and a memory controller policy that use the profiled information provided by BLAP. Together, these mechanisms improve performance by up to 17.39% (3.9% on average). Under higher memory pressure caused by more aggressive prefetching, our technique showed average gains of 13.14%.
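
To make the idea of block-level profiling concrete, the following is a minimal software sketch, not the BLAP hardware design. It assumes a table indexed by basic block start address with one set of counters per block; the names `BlockStats` and `BlockProfiler`, the counter layout, and both thresholds are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Counters kept once per basic block (hypothetical layout)."""
    executions: int = 0
    loads: int = 0
    load_misses: int = 0
    branch_mispredictions: int = 0


class BlockProfiler:
    """Toy profiler keyed by basic block start address."""

    def __init__(self, miss_ratio_threshold=0.5, mispredict_ratio_threshold=0.3):
        # Thresholds are illustrative assumptions, not values from the paper.
        self.miss_ratio_threshold = miss_ratio_threshold
        self.mispredict_ratio_threshold = mispredict_ratio_threshold
        self.table = defaultdict(BlockStats)

    def record(self, block_addr, loads=0, load_misses=0, mispredicted=False):
        """Record one dynamic execution of the block starting at block_addr."""
        stats = self.table[block_addr]
        stats.executions += 1
        stats.loads += loads
        stats.load_misses += load_misses
        stats.branch_mispredictions += int(mispredicted)

    def has_delinquent_loads(self, block_addr):
        """True if the block's loads miss at or above the threshold ratio."""
        stats = self.table.get(block_addr)  # .get() does not create an entry
        if stats is None or stats.loads == 0:
            return False
        return stats.load_misses / stats.loads >= self.miss_ratio_threshold

    def has_hard_branch(self, block_addr):
        """True if the block's terminating branch is often mispredicted."""
        stats = self.table.get(block_addr)
        if stats is None or stats.executions == 0:
            return False
        ratio = stats.branch_mispredictions / stats.executions
        return ratio >= self.mispredict_ratio_threshold
```

Because a single entry covers every instruction in the block, load and branch behavior sit side by side in the same record, which is the kind of combined information the abstract describes passing to speculative mechanisms.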


Similar resources

A dynamic block-level execution profiler

Most performance-enhancing mechanisms in current processors, such as branch predictors or prefetchers, rely on program characteristics monitored at the granularity of single instructions. However, many of these characteristics can be obtained at the basic block level instead. The coarser granularity allows a larger portion of the code to be examined, enabling more accurate profiling and a det...

A Study of the Performance Potential for Dynamic Instruction Hints Selection

Instruction hints have become an important way to communicate compile-time information to the hardware. They can be generated by the compiler and the post-link optimizer to reduce cache misses, improve branch prediction and minimize other performance bottlenecks. This paper discusses different instruction hints available on modern processor architectures and shows the potential performance impa...

Hardware in Loop of a Generalized Predictive Controller for a Micro Grid DC System of Renewable Energy Sources

In this paper, a hardware-in-the-loop (HIL) simulation is presented. This application is proposed as the first step before a real implementation of a Generalized Predictive Control (GPC) on a micro-grid system located at the Military University Campus in Cajica, Colombia. The designed GPC seeks to keep the battery bank State of Charge (SOC) above 70% and below 90%, which ensures the bes...

A Novel Paradigm of Parallel Computation and its Use to Implement Simple High Performance Hardware

Communication mechanisms within concurrent computer systems are extremely hostile to optimizing compilers. Vector machines also have fundamental performance bottlenecks [33][35], and their sustained average performance is several orders of magnitude lower than their peak rate [15, 33], even when creative coding techniques help the compiler [34]. VLIW (Very Long Instruction Word) architecture...

Role of Compilers in Computer Architecture

A compiler is a software layer that translates programs written in a high-level programming language so that they can be executed by the underlying hardware architecture. Though a compiler is designed mainly around the language specification, the hardware it targets plays a significant role in compiler design. Effective compilers allow for a more eff...

Publication date: 2014